Data Generation: Incompressible Navier-Stokes
In this document, we cover dataset and FIM generation for incompressible Navier-Stokes.
Incompressible Navier-Stokes
- We follow the dataset generation scheme from Physics-Informed Neural Operator.
- For validation purposes, we currently form the full Fisher Information Matrix and then compute its eigenvectors.
- Our next step is a low-rank approximation or trace estimation so that we don't have to form the full matrix.
Dataset
Our dataset consists of \(2000\) pairs \(\{K, S^t(K)\}_{t=1}^8\), where \(K\) is the permeability field and \(S^t(K)\) is the saturation at time step \(t\).
Fisher Information Matrix
- To find the optimal number of observations, \(M\), we visualize the eigenvectors and vector-Jacobian products.
- We observe that as \(M\) increases, the boundary of the permeability becomes clearer, which should be more informative during training and inference.
- Given one data pair, \(\{K, S^t(K)\}_{t=1}^8\), we compute a single FIM.
Computing Fisher Information Matrix for each datapoint
We consider the realistic scenario in which we only have access to samples, not the underlying distribution. Let \(N\) be the number of samples and \(X \in \mathbb{R}^{d \times d}\); a neural network model \(F_{nn}\) learns the mapping \(X_i \rightarrow Y_i\). For each pair in \(\left\{X_i, Y_i \right\}^N_{i=1}\), we generate one of \(\left\{\mathrm{FIM}_i\right\}_{i=1}^{N}\).
- \(N\): number of data pairs, \(\left\{X_i, Y_i \right\}\)
- \(M\): number of observations, \(Y\), per data pair
\[ \left\{ X_i \right\}^N_{i=1} \sim p_X(X), \quad \epsilon \sim \mathcal{N}(0, \Sigma), \quad \Sigma = I \]

For a single data pair, we generate multiple observations:

\[ Y_{i, J} = F(X_i) + \epsilon_{i, J}, \quad \text{where } \epsilon_{i,J} \sim \mathcal{N}(0, \Sigma), \; i = 1, \dots, N, \; J = 1, \dots, M \]

As we assumed Gaussian noise with \(\Sigma = I\), we define the likelihood as follows:

\[ p(Y_{i,J} \mid X_i) \propto e^{-\frac{1}{2}\|Y_{i,J}-F(X_i)\|^2_2} \]

\[ \log p(Y_{i,J} \mid X_i) = -\frac{1}{2}\|Y_{i,J}-F(X_i)\|^2_2 + \text{const} \]

The FIM for a single data pair \(i\) is:

\[ \mathrm{FIM}_i = \mathbb{E}_{Y_{i,J} \sim p(Y_{i,J} \mid X_i)} \left[ \left(\nabla \log p(Y_{i,J} \mid X_i)\right)\left(\nabla \log p(Y_{i,J} \mid X_i)\right)^T\right] \]

In practice, we approximate this expectation by the empirical average over the \(M\) observations.
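The per-pair FIM above can be sketched numerically. Below is a minimal NumPy sketch, assuming a toy linear forward map \(F(X) = AX\) (so the Jacobian is simply \(A\)) as a stand-in for the actual solver; all dimensions and names are illustrative, not the real pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16      # flattened parameter dimension (toy stand-in for a 64x64 field)
M = 1000    # number of noisy observations for this data point

# Hypothetical linear forward map F(X) = A @ X, so the Jacobian J is A.
A = rng.normal(size=(d, d))
F = lambda X: A @ X

X_i = rng.normal(size=d)                  # one data point
Y = F(X_i) + rng.normal(size=(M, d))      # M observations, Sigma = I

# Score of the Gaussian likelihood: grad_X log p(Y|X) = J^T (Y - F(X)).
scores = (Y - F(X_i)) @ A                 # row J is A.T @ (Y_J - F(X_i))

# Monte Carlo estimate of FIM_i = E[score score^T].
FIM_i = scores.T @ scores / M             # (d, d), symmetric PSD

# For Sigma = I the exact FIM is J^T J = A.T @ A, which the
# empirical estimate approaches as M grows.
exact = A.T @ A
rel_err = np.linalg.norm(FIM_i - exact) / np.linalg.norm(exact)
print(rel_err)
```

The same recipe repeated over all \(N\) data pairs yields \(\{\mathrm{FIM}_i\}_{i=1}^N\).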
How does the FIM change as the number of observations increases?
The FIM is the expectation of the outer product of the score, i.e., the covariance of the gradient of the log-likelihood. As we expected, the diagonal structure becomes clearer as \(M\) increases.
Making Sense of FIM obtained
Still, does our FIM make sense? How can we better understand what the FIM represents?
Let's look at the first row of the FIM and reshape it to \([64, 64]\).
- As we expected from the definition of the FIM, each plot is just a different linear transformation of \(\nabla \log p(\{S^t\}^8_{t=1}|K)\).
- As we will see below, each row of the FIM is a noisy, scaled version of its leading eigenvector.
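The row-versus-eigenvector claim can be checked on a toy near-rank-1 FIM (hypothetical numbers, not the real matrix). We pick the row with the strongest entry, since rows where the eigenvector is near zero are mostly noise:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 64

# Toy FIM: one dominant score direction v plus small isotropic noise,
# mimicking an FIM whose top eigenvector carries most of the structure.
v = rng.normal(size=d)
v /= np.linalg.norm(v)
FIM = 100.0 * np.outer(v, v) + 0.1 * np.eye(d)

# eigh returns eigenvalues in ascending order; the top eigenvector is last.
eigvals, eigvecs = np.linalg.eigh(FIM)
top = eigvecs[:, -1]

# Row k of a near-rank-1 matrix is roughly (100 * v[k]) * v, i.e. a
# scaled copy of the top eigenvector. Pick the row with the strongest entry.
k = int(np.argmax(np.abs(v)))
row = FIM[k]
cosine = abs(row @ top) / np.linalg.norm(row)
print(cosine)  # close to 1: the row points along the top eigenvector
```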
How do the eigenvectors of the FIM look as \(M\) increases?
\(M = 1\) (Single Observation)
- Even when the FIM is computed with a single observation, we see that the largest eigenvector has the most definition in the shape of the permeability. The rest of the eigenvectors look more like noise.
\(M = 10\)
\(M = 100\)
\(M = 1000\)
- As \(M\) increases, we observe the flow through the channel more clearly.
- The boundary of the permeability gets sharper.
- In general, the eigenvectors get less noisy.
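The sharpening of the top eigenvector with \(M\) can be reproduced on a toy problem. A sketch, assuming a synthetic exact FIM with a known top eigenvector (the spectral gap of 4:1 is an illustrative choice, not from the real data):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 16
Ms = [1, 10, 100, 1000]

# Toy exact FIM = J^T J with a known top eigenvector q0 and a moderate
# spectral gap (eigenvalues 4, 1, 1, ...). J is its symmetric square root.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
eigs = np.ones(d); eigs[0] = 4.0
J = Q @ np.diag(np.sqrt(eigs)) @ Q.T      # symmetric, so J^T J = Q diag(eigs) Q^T
q0 = Q[:, 0]

alignments = []
for M in Ms:
    eps = rng.normal(size=(M, d))         # observation noise, Sigma = I
    scores = eps @ J                      # score_J = J^T eps_J (J symmetric)
    FIM = scores.T @ scores / M           # empirical FIM from M observations
    top = np.linalg.eigh(FIM)[1][:, -1]   # top eigenvector of the estimate
    alignments.append(abs(top @ q0))      # |cos angle| with the true direction

print(alignments)
```

The alignment with the true top eigenvector improves as \(M\) grows, matching the qualitative trend in the plots above.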
How does the vector-Jacobian product look as \(M\) increases?
- We observe that the vector-Jacobian product looks more like the saturation than the permeability.
- As \(M\) increases, the scale on the color bar also increases.
- One possible conclusion:
- The vjp tells us the locations in the spatial distribution (likelihood space) with the largest variation, which thus carry the most information about the parameter.
- \(J^T v\), when \(v\) is the largest eigenvector of the FIM, projects the Jacobian onto the direction of maximum sensitivity.
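The \(J^T v\) interpretation can be illustrated numerically. A sketch with a hypothetical mildly nonlinear forward map standing in for \(K \rightarrow S^t(K)\) (here parameters and observations share the same dimension, so \(J^T v\) is well-defined for either space); the Jacobian is formed by finite differences rather than autodiff:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 16

# Hypothetical mildly nonlinear forward map (stand-in for the solver).
A = rng.normal(size=(d, d)) / np.sqrt(d)
def F(x):
    return np.tanh(A @ x)

x0 = rng.normal(size=d)

# Jacobian of F at x0, column by column via central differences.
eps = 1e-5
J = np.empty((d, d))
for k in range(d):
    e = np.zeros(d); e[k] = eps
    J[:, k] = (F(x0 + e) - F(x0 - e)) / (2 * eps)

# With Sigma = I, FIM = J^T J; its top eigenvector v_param is the
# parameter direction of maximum sensitivity, and the square root of
# its top eigenvalue is the largest singular value of J.
eigvals, eigvecs = np.linalg.eigh(J.T @ J)
v_param = eigvecs[:, -1]

# The vjp J^T v pulls an observation-space direction v back to parameter
# space. Choosing v as the normalized image J v_param recovers v_param
# scaled by the largest singular value: maximum sensitivity direction.
v_obs = J @ v_param
v_obs /= np.linalg.norm(v_obs)
vjp = J.T @ v_obs

sigma_max = np.sqrt(eigvals[-1])
print(np.linalg.norm(vjp), sigma_max)  # these agree up to FD error
```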





